Abstract: COMPASS is the name of a Computer Aided
Scheduling System designed and built by McDonnell Dou- 
glas Space Systems Company for NASA. COMPASS can be 
used to develop a schedule of activities based upon the
temporal relationships of the activities and their resource
requirements. COMPASS uses this information and,
guided by the user, develops precise start and stop times
for the activities. In actual practice, however, it is
impossible to know with complete certainty what the actual
durations of the scheduled activities will be. The best
that one can hope for is knowledge of the probability
distribution for the durations. This paper investigates
methodologies for using a scheduling tool like COMPASS that
is based upon definite values for the resource require- 
ments, while building schedules that remain valid in the 
face of schedule execution perturbations. Representations 
for the schedules developed by these methodologies are 
presented, along with a discussion of the algorithm that 
could be used by a computer onboard a spacecraft to effi- 
ciently monitor and execute these schedules. 

Introduction 

The dictionary definition of robust is “strong and healthy.” 
A robust schedule, therefore, would be one which exhibits
the characteristics we associate with strength and health. 
There are two interesting characteristics of schedule 
strength. The first is the ability of the schedule to accom- 
plish useful work (how much is scheduled), and the sec- 
ond is the ability of the schedule to resist failure due to 
perturbations (the reliability of the schedule). Obviously 
these two characteristics are in competition with each 
other. A densely packed schedule will be more prone to 
failure if activities run long when actually executed.
Alternatively, padding the scheduled durations of the activities
with some extra “slack” time, in order to absorb any
perturbations, reduces the number of activities that can fit into
a fixed length schedule.

In order to examine the concept of robust schedules, we 
defined metrics that capture these two differing
characteristics of schedule strength. Many metrics for measuring
the amount of work that a schedule accomplishes have been
proposed before. In fact, it is these kinds of metrics that
most schedule optimizers use as their objective function to
maximize (or minimize). Examples of these kinds of metrics
include total makespan, the sum of the values of
the activities placed on the schedule, mean or total
tardiness, and mean time in process. This paper defines a
schedule robustness metric that is a measure of the
reliability of the schedule.

The reliability of hardware items is
typically described by a Mean Time To Failure (MTTF).
Furthermore, models exist which describe the expected 
reliability of systems built of component pieces for which 
the stochastic behavior is known, or can be derived. Simi- 
larly, our approach develops a notion of a MTTF for a 
schedule. To do this, we defined a concept of the failure of 
a schedule, and developed a model that describes how to 
calculate the MTTF of a schedule, given a description of 
the stochastic behavior of the activities that make up the 
schedule. 

In order to define the concept of a schedule failure it is 
necessary to describe the overall schedule development 
and execution process. Schedule development begins with 
a set of tasks to be performed, along with their resource
requirements. In addition, there may be some temporal 
relationships between tasks. For example, one task may 
require that a second task be completed before the first 
task can begin. Based upon this information, along with
other information such as the task priority or value, a 
schedule is created. Typically, the schedule that is created 
will be feasible. In other words, the schedule will contain 
no inconsistencies. All the given temporal constraints will 
be satisfied, and none of the resources will be oversub- 
scribed. At execution time, the schedule is used to deter- 
mine what resources should be assigned to what tasks, and 
when. As the schedule is executed, deviations from the a 
priori schedule will occur. If these deviations become too 
large, the schedule will no longer be valid, and a new 
schedule of the remaining tasks must be created. When 
this happens, the original schedule has suffered a failure. 

Schedule development consists basically of assigning
resources and times for the performance of activities in
order to meet some deadline. It is well established that the
Resource Constrained Scheduling decision problem (RCS)
is NP-complete [1], and most scheduling problems are
NP-hard. This means that no known algorithm can develop a
schedule in better than exponential time, in the worst case,
relative to the number of tasks and/or resources. Since RCS is
NP-complete, however, a particular encoding of a solution to an
RCS problem can be verified in time that is of polynomial order
relative to the number of tasks and resources. This provides the
rationale for the definition of a schedule failure. When the
perturbations become large enough that a polynomial-time
algorithm can no longer accommodate the deviations, a
schedule failure occurs, and the NP-hard problem must be
solved again.

The remaining question is the representation of the
schedule which can be verified in polynomial time. This paper
will describe two schedule representations called the time
constrained schedule representation and the order
constrained schedule representation. These two
representations can be merged into a single approach to allow the
schedulers to use their choice of method.

Time Constrained Schedule Representation 

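The standard definition of an RCS problem is as follows:
Given a set $T$ of tasks $t_i$, for $1 \le i \le n$, with durations
defined by a function $l: T \to Z^+$, resource requirements
$R_i: T \to R_0^+$, and resource bounds $B_i$ for $1 \le i \le k$, and an
overall deadline $D \in Z^+$; find (does there exist) a
schedule $\sigma: T \to Z_0^+$ such that

$$\sigma(t) + l(t) \le D \quad \text{for all } t \in T \qquad (1)$$

$$\sum_{\{t \in T \,:\, \sigma(t) \le j < \sigma(t) + l(t)\}} R_i(t) \le B_i \quad \text{for all } 1 \le i \le k \text{ and } 0 \le j < D \qquad (2)$$

where $Z^+$ is the set of positive integers, $Z_0^+$ is the set of
non-negative integers, and $R_0^+$ is the set of reals $\ge 0$.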

Under this notation, the set $T$ defines the tasks that need to
be scheduled. The tasks can be scheduled to start at any
integral value of time between zero and the overall
schedule deadline $D$. The resource requirements are defined by
the functions $R_i$, which associate a real value with each
task for each resource $i$. The resource bounds $B_i$ define
the capacity of each resource. The function $\sigma$ defines a
schedule by assigning to each task an integral start time.
Equations 1 and 2 guarantee that this schedule satisfies the
overall deadline and the resource capacity bounds,
respectively. However, this representation does not provide any
mechanism for handling perturbations in the task
durations, since only a single integer length is defined for each
task by the function $l$.
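As an illustration of how a proposed schedule can be checked against Equations 1 and 2, the following is a minimal sketch in Python. The function name `is_feasible`, its argument names, and the dictionary-based encoding of tasks are assumptions made for this example; they are not part of COMPASS. For clarity the check visits every integer time step up to the deadline; visiting only the scheduled start times would be sufficient, since resource usage can only increase at those points.

```python
from typing import Dict, List


def is_feasible(start: Dict[str, int], length: Dict[str, int],
                demand: Dict[str, List[float]], bounds: List[float],
                deadline: int) -> bool:
    """Check Equations 1 and 2 for an integer-time schedule (illustrative)."""
    # Equation 1: every task starts at time >= 0 and ends by the deadline D.
    if any(start[t] < 0 or start[t] + length[t] > deadline for t in start):
        return False
    # Equation 2: at each time step j, usage of resource i stays within B_i.
    for j in range(deadline):
        running = [t for t in start if start[t] <= j < start[t] + length[t]]
        for i, bound in enumerate(bounds):
            # Small tolerance guards against floating-point round-off.
            if sum(demand[t][i] for t in running) > bound + 1e-9:
                return False
    return True
```

For example, two tasks that each need half of a single resource may overlap, but a third task needing the whole resource may not overlap either of them.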

The time constrained schedule representation extends this 
notation to the probabilistic case by assuming that the task 
length function returns an assumed duration of the activity. 
In general, one can define a family of mappings from the 
probability distribution for the task durations to an 
assumed duration for scheduling by 

$$l_p(t) = \min \{ z \in Z \mid \Pr(X_t \le z) \ge p \} \quad \text{for } 0 < p \le 1 \qquad (3)$$

where $X_t$ is a random variable equal to the duration of task
$t$. This formula defines the assumed duration of a task $t$,
with respect to a probability $p$, to be the minimum duration
for which the probability of completing the task is at least
$p$. The parameter $p$ is called the probability threshold.
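The mapping in Equation 3 is a quantile of the duration distribution. The sketch below computes it for a discrete distribution given as a dictionary from integer durations to probabilities. The function name `assumed_duration` and the dictionary encoding are assumptions made for illustration only.

```python
from typing import Dict


def assumed_duration(pmf: Dict[int, float], p: float) -> int:
    """Smallest integer z with Pr(X <= z) >= p for a discrete duration pmf."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must satisfy 0 < p <= 1")
    cumulative = 0.0
    for z in sorted(pmf):
        cumulative += pmf[z]
        # Tolerance absorbs floating-point round-off in the running sum.
        if cumulative >= p - 1e-12:
            return z
    raise ValueError("probabilities sum to less than p")
```

With a hypothetical distribution `{3: 0.2, 4: 0.5, 5: 0.2, 8: 0.1}`, a threshold of 0.5 gives the median duration 4, a threshold of 0.9 gives 5, and a threshold of 1.0 gives the worst case 8.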

This approach accommodates random variation in the task
duration by defining a window in which the task can
execute. The size of this window is controlled by the
parameter $p$. When $p = 1.0$, the window is set to the worst-case
execution time for each task. A value of $p = 0.5$ would
set the window for each task to the median value of the
duration probability distribution. When this schedule
representation is used by an onboard executive, a task would
never begin before its assigned start time, as defined by the
function $\sigma$. If the actual duration of any task exceeded the
window defined for that task, we could no longer guarantee
that the resource and deadline constraints are satisfied
without re-solving an NP-hard problem. Therefore, at this
time a schedule failure has occurred. Since the boundary
conditions by which a schedule failure is determined are
the fixed time windows, this approach to accommodating
variable duration tasks is called the time constrained
representation.

The time constrained approach provides a simple mecha- 
nism for a real time schedule executive to be able to deter- 
mine when to initiate tasks, while determining if the 
schedule remains valid in light of the actual durations seen 
so far. However, the time constrained representation is 
fairly fragile in terms of its resistance to failure. It is easy
to see that the probability of a task successfully
completing within its window is just $p$. If we assume that the
durations for the tasks are stochastically independent, the
probability that all $n$ tasks will complete within their
windows is $p^n$. For any $p < 1$, as $n \to \infty$, $p^n \to 0$.
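To give a feel for how quickly this probability decays, the short sketch below evaluates it for a hypothetical threshold of 0.95. The function name `all_windows_hold` and the chosen values of the threshold and task counts are illustrative assumptions.

```python
def all_windows_hold(p: float, n: int) -> float:
    """Probability that n independent tasks all finish inside their windows."""
    return p ** n


# Print the success probability for schedules of increasing size.
for n in (10, 50, 100):
    print(n, round(all_windows_hold(0.95, n), 3))
```

Even with each task given a 95 percent window, a schedule of 10 tasks succeeds only about 60 percent of the time, and a schedule of 50 tasks succeeds less than 8 percent of the time.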


Order Constrained Schedules

The fragility of the time constrained approach is due to the
fact that the schedule is successful if and only if all the
windows completely surround the actual duration of their
tasks. There is no capability in this approach for the
random variations to “average out.” Even if all but one task
use less than their allotted time, but the one task exceeds
its window, a schedule failure will occur. In trying to
develop an alternative representation which allows for
increased flexibility by allowing the random variations to
accumulate and average out, the technique of PERT charting
naturally comes to mind.

In a PERT chart, the schedule is represented as a directed
acyclic graph (DAG). The DAG is a graphical
representation of the predecessor-successor partial ordering. There
are two commonly used representations of the DAG, called
“activity on node” and “activity on edge.” This paper will
use the “activity on edge” representation. In the “activity
on edge” representation of a PERT chart, the nodes or
vertices of this graph are called events, and the edges are the
tasks or activities. If the edges $e_1$ and $e_2$ are part of a
directed path through the DAG, in that order, then the task
associated with $e_1$ is a predecessor of the task associated
with $e_2$. Occasionally dummy tasks need to be added to
the DAG to accurately depict all the predecessor/successor
relationships. These are usually drawn as dashed
edges. The earliest possible start of a task is the maximum
length of all the paths that lead up to the start node for
the task, where the length of an edge in the path is just
the corresponding task duration. As the schedule
executive executes the schedule, the actual durations can be
substituted for the assumed durations for each task. This
has the effect that a task can start only when all of its
predecessors are finished.

With this idea as the basis for the order constrained
approach, two questions need to be answered. How is
the original resource constrained scheduling solution
converted into a DAG, and how does the executive
determine if the schedule is still valid based upon the
DAG and the actual durations so far?

To illustrate the problems associated with creating a
DAG from the resource constrained scheduling
problem, consider the allocation of resource $i$ as shown in
FIGURE 1. In this figure, the horizontal axis represents
time, and the vertical axis represents the allocation of
resource $i$ to the various tasks. In this example, tasks $t_1$, $t_2$, $t_3$,
$t_6$, $t_7$, and $t_8$ each have a resource requirement of $0.33 B_i$.
Tasks $t_4$ and $t_5$ each have a resource requirement of $0.5 B_i$.
The time constrained approach guarantees that the
sum of the resource requirements of all simultaneously
executing tasks does not exceed the resource bound. For
example, the executive would never allow tasks 1, 2 and
5 to execute simultaneously, because the
windows for tasks 1 and 2 end before the window for task 5
begins. The problem for the order constrained approach
is to define a partial ordering, implemented as a DAG,
which accomplishes the same goal.

FIGURE 1 Resource timeline
One straightforward way of accomplishing this goal is to
define the partial order relation $\prec$ by

$$t_1 \prec t_2 \iff \sigma(t_1) + l(t_1) \le \sigma(t_2)$$

In other words, task $t_1$ precedes task $t_2$ if,
and only if, $t_1$ is scheduled to finish at or before the
scheduled start of task $t_2$. This in general will create more
predecessor/successor relationships than are necessary, but it
is a simple matter to go through and remove the redundant
relations. FIGURE 2 shows the PERT chart DAG which
results from applying this procedure to the schedule in
FIGURE 1.
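The construction just described, followed by the removal of redundant relations, can be sketched as below. The function names `precedence_from_schedule` and `remove_redundant`, and the representation of the relation as a set of ordered pairs, are assumptions for this example. The redundancy test relies on the relation being transitive, which holds here because all durations are positive integers.

```python
from itertools import permutations
from typing import Dict, Set, Tuple


def precedence_from_schedule(start: Dict[str, int],
                             length: Dict[str, int]) -> Set[Tuple[str, str]]:
    """All pairs (a, b) where a is scheduled to finish at or before b starts."""
    return {(a, b) for a, b in permutations(start, 2)
            if start[a] + length[a] <= start[b]}


def remove_redundant(order: Set[Tuple[str, str]]) -> Set[Tuple[str, str]]:
    """Drop (a, c) whenever some b gives both (a, b) and (b, c)."""
    tasks = {t for pair in order for t in pair}
    return {(a, c) for (a, c) in order
            if not any((a, b) in order and (b, c) in order for b in tasks)}
```

For a hypothetical schedule with starts `{"t1": 0, "t2": 0, "t3": 2, "t4": 5}` and lengths `{"t1": 2, "t2": 3, "t3": 2, "t4": 1}`, the pair `("t1", "t4")` is generated by the first function and removed by the second, because `t1` already precedes `t4` through `t3`.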


While it can readily be seen that this partial ordering of the
tasks will ensure that the resource capacity constraints are
not exceeded, it can also be seen that it is overly
constraining. For example, once tasks 1 and 2 complete, task 4 can
be safely initiated since the resources required by tasks 1
and 2 are more than enough to satisfy task 4’s requirement.
Repeated application of this logic will eventually reduce
the PERT graph in FIGURE 2 to the graph shown in
FIGURE 3.

FIGURE 3 Reduced PERT DAG

Formally, then, an order constrained schedule as a solution
to an RCS problem is defined to be a partial ordering $\prec$ of
the tasks in $T$ such that:

$$\mathrm{length}(X) \le D \quad \text{for all paths } X \text{ in } \prec \qquad (4)$$

$$\sum_{t \in S} R_i(t) \le B_i \qquad (5)$$

for all $i$, and for all $S \subseteq T$ such that

$$t_1, t_2 \in S \Rightarrow (t_1, t_2) \notin \prec \text{ and } (t_2, t_1) \notin \prec$$

Equation 4 is the revised constraint that guarantees that the
partial ordering satisfies the overall deadline requirement.
Equation 5 ensures that any set of tasks that might execute
at the same time does not exceed the capacity of any
resource.


One final question that needs to be addressed is how to
calculate the length of a path through the PERT network.
Obviously, the length of the path should be the sum of the
lengths of the individual tasks, but what value do we use
for the length of the tasks, since we are assuming that
these values vary? The a priori assumption, at schedule
build time, is a task length based upon a probability
threshold $p$, just as in the time constrained case. After the
completion of the schedule, the a posteriori value of the
task lengths is just the observed actuals. But what about
during the execution of the schedule, when there are some
actuals, and some unknowns? One could just use the a priori
assumed lengths for the unknown durations. However,
a more general approach is to define a second probability
threshold $q$, with $0 \le q \le p$. This defines a new length
function $l_q$. The parameter $q$ controls the amount of
pessimism about the ability to recover when the actual
execution is behind the a priori schedule. When $q = 0$, the
executive will not declare a failure as long as there is some
possibility of completing the schedule within its overall
deadline by assuming that all remaining tasks will
complete in their best case, or minimum durations. When
$q = p$, the assumption is that the remaining tasks will
complete in no less time than the a priori assumed
durations. In either case, when the decision is made that the
tasks will no longer complete by the overall deadline
according to the current schedule, a failure is declared.

For a given partial order over the tasks of $T$, it is possible
to calculate the length of the longest path, based upon the
$l_q$ length, and starting at the end node of each task $t$. If the
actual end time of task $t$ is later than $D - \max_X l_q(X)$,
where $X$ is any path starting at the end node of $t$, then at least
one path through task $t$ will have a path length greater than $D$.
Therefore it is possible to precompute a deadline for each
task by which the task must complete in order
for the schedule to meet the overall deadline in light of the
actuals so far.
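One way to precompute these per-task deadlines is a longest-path calculation over the precedence graph, sketched below. For brevity the sketch stores the graph as a successor list per task rather than in the activity on edge form; this encoding, the function name `task_deadlines`, and the argument names are assumptions for illustration. The dictionary `lq` is assumed to hold the $l_q$ length of every task.

```python
from functools import lru_cache
from typing import Dict, List


def task_deadlines(successors: Dict[str, List[str]], lq: Dict[str, int],
                   deadline: int) -> Dict[str, int]:
    """Latest finish time per task: D minus the longest l_q path after it."""

    @lru_cache(maxsize=None)
    def tail(task: str) -> int:
        # Longest remaining path once `task` ends; 0 if nothing follows it.
        return max((lq[s] + tail(s) for s in successors.get(task, [])),
                   default=0)

    return {t: deadline - tail(t) for t in lq}
```

With hypothetical successors `{"t1": ["t3"], "t2": ["t4"], "t3": ["t4"]}`, lengths `{"t1": 2, "t2": 3, "t3": 2, "t4": 1}`, and an overall deadline of 10, task `t1` must finish by time 7, tasks `t2` and `t3` by time 9, and task `t4` by time 10.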

This suggests that it is possible to combine the two
schedule representations into one. The combined representation
consists of a partial ordering of the tasks of $T$, the window
start times defined by the function $\alpha(t)$, and the window
end times defined by $\Omega(t)$. For a time constrained
approach, the partial order is empty, and the window start
and end functions are defined by:

$$\alpha(t) = \sigma(t) \qquad (6)$$

$$\Omega(t) = \sigma(t) + l(t) \qquad (7)$$

For an order constrained approach, the partial order is
determined as described above, and the window start and end
times are defined by:

$$\alpha(t) = 0 \qquad (8)$$

$$\Omega(t) = D - \max_{X \in P} l_q(X) \qquad (9)$$

where $P$ is the set of all paths starting at the end node of $t$.

The job of the onboard schedule executive is to find all 
tasks that have no unfinished predecessors. Once the start 
window has been reached for these tasks, they are initi- 
ated. If any currently executing task fails to finish by its 
window end time, the schedule has failed and must be 
repaired by reinvoking the scheduler. It is fairly easy to see
that the job of this onboard executive is tractable in the
sense that it can be completed in time that is of polynomial
order in the number of tasks.
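A single decision step of such an executive might look like the following sketch. The function name `executive_step`, its arguments, and the convention that the caller moves a task from `running` to `finished` before calling are all assumptions for this example, not a description of an actual flight executive. Each call does work proportional to the number of tasks times the size of their predecessor sets.

```python
from typing import Dict, List, Set, Tuple


def executive_step(now: int, preds: Dict[str, Set[str]],
                   window_start: Dict[str, int], window_end: Dict[str, int],
                   running: Set[str],
                   finished: Set[str]) -> Tuple[List[str], bool]:
    """Return (tasks to start now, schedule_failed) for one clock tick."""
    # A task still running past its window end is a schedule failure.
    if any(now > window_end[t] for t in running):
        return [], True
    ready = [t for t in preds
             if t not in running and t not in finished
             and preds[t] <= finished      # no unfinished predecessors
             and now >= window_start[t]]   # start window has been reached
    return sorted(ready), False
```

When the second element of the result is true, the caller would hand the remaining tasks back to the scheduler for repair.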

Development of Robust Schedules 

Armed with this model of a flexible schedule
representation that can accommodate some measure of perturbations
during its execution, it is possible to define a method for 
using a deterministic scheduling system like COMPASS to 
build and manage robust schedules. 

Since resource constrained scheduling is an NP-hard
problem, COMPASS uses a mixed initiative dialog to generate
feasible schedules that satisfy the user defined
requirements [2,3]. Extending COMPASS to handle uncertain
requirements, in particular probabilistic task durations,
should therefore consist of adding commands to allow the
user to interactively control the risk and uncertainty
inherent in a particular schedule. Specifically, the user must be
able to view, analyze and modify the risk and uncertainty
inherent in a particular schedule. Analysis of a given
schedule can be carried out by running a Monte Carlo
simulation of a large number of possible schedule
executions to determine the MTTF of the schedule. If either the
MTTF or the number of tasks that fit in the schedule is
unacceptable, the user can adjust the a priori duration
probability threshold and reschedule the tasks.
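A Monte Carlo analysis of this kind could be sketched as follows for the time constrained representation. The paper does not spell out the estimator, so several choices here are assumptions: a run is taken to fail at the window end of the earliest overrunning task, the mean failure time is averaged over failed runs only, and the function name `simulate_failures` and the per-task sampler functions are hypothetical.

```python
import random
from typing import Callable, Dict, Optional, Tuple


def simulate_failures(start: Dict[str, int], window: Dict[str, int],
                      sample: Dict[str, Callable[[random.Random], int]],
                      runs: int,
                      seed: int = 0) -> Tuple[float, Optional[float]]:
    """Return (fraction of failed runs, mean failure time over failed runs)."""
    rng = random.Random(seed)
    failure_times = []
    for _ in range(runs):
        # A task whose sampled duration overruns fails at its window end.
        overruns = [start[t] + window[t] for t in start
                    if sample[t](rng) > window[t]]
        if overruns:
            failure_times.append(min(overruns))
    rate = len(failure_times) / runs
    mean_time = (sum(failure_times) / len(failure_times)
                 if failure_times else None)
    return rate, mean_time
```

A user could rerun this analysis after changing the probability threshold that sets the windows, and compare the resulting failure rate and mean failure time against the number of tasks that still fit in the schedule.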

Conclusions 

By combining fixed time windows with a PERT-style
precedence graph, it is possible to build a schedule
representation that can be executed and monitored by an automatic
schedule executive in a tractable way. Given that the
durations of the scheduled tasks are not deterministic, but
instead are represented by probability distributions, it is
possible to identify probability thresholds that control the a
priori durations to use for scheduling and the a posteriori
limits to be monitored against. Given the probability
distributions of the task durations, it is possible to perform a
Monte Carlo analysis to determine the MTTF of a given
schedule.